INHIBITORS OF NUCLEAR PROTEIN: NUCLEAR RECEPTOR INTERACTION

The present invention relates to inhibition of the interaction between nuclear proteins
and nuclear receptors through identification of the key structural element responsible for the
interaction.

The binding of lipophilic hormones, retinoids and vitamins to members of the nuclear
receptor (NR) superfamily (to form “liganded” receptors) modifies their DNA binding and
transcriptional properties, resulting in the activation or repression of target genes. Ligand
binding induces conformational changes in NRs and promotes their association with a diverse
group of nuclear proteins, including SRC-1/p160 (refs 3, 4, 5), TIF2 (refs 6, 7) and CBP/p300
(refs 4, 5, 8, 9), which function as coactivators, and RIP-140 (ref 10), TIF1 (ref 11) and
TRIP1/SUG1 (refs 12, 13), whose functions are unclear.

The recruitment of nuclear proteins (coactivators and/or other so-called bridging
proteins) by NRs is thought to be essential to their function as ligand-induced transcription
factors. Structural studies of the ligand binding domains (LBDs) of three different nuclear
hormone receptors, the retinoid X receptor α (RXRα) (ref 15), the retinoic acid receptor γ
(RARγ) (ref 16) and the thyroid hormone receptor β (TRβ) (ref 17), have led to the proposal
that binding of ligand results in a realignment of a conserved amphipathic α-helix, Helix 12
(H12), generating a novel surface required for coactivator binding and consequently activator
function 2 (AF2)-dependent transactivation. Consistent with this, mutations of conserved
hydrophobic residues in H12 which impair AF2 (refs 14, 18-20) also interfere with the ability
of NRs to bind coactivators (refs 4, 6, 10, 11, 13). Less is known about the coactivator
sequences which mediate interaction with NRs, although several proteins appear to contain
multiple NR binding sites (refs 3, 8, 21). Le Douarin et al (1996) in EMBO Journal, 15,
6701-6715, identified a leucine rich region in three coactivators (TIF1, RIP140 & TRIP3) which
they called the “NR box”; see Figure 3D therein. However the present state of knowledge is
completely silent about precisely how liganded nuclear receptors interact with nuclear proteins
as a class to modify their DNA binding and transcriptional properties, resulting in the
activation or repression of target genes. Indeed a commentator on the field stated, after the
first filing date of the present invention, that “characterizing the mechanisms by which
nuclear factors engage the transcriptional apparatus in response to hormonal stimulation has
seemed, at times, to be an insurmountable task” (Marc Montminy in Nature, 12th June 1997, 387,
654-655; see 1st paragraph thereof).

The present invention is based on the discovery that a short signature motif present in 
the nuclear proteins is necessary and sufficient to mediate their binding to liganded NRs. 

According to one aspect of the present invention there is provided a method for
identifying inhibitor compounds capable of reducing the interaction between:

a) a first region which is a signature motif on a nuclear protein, and

b) a second region which is that part of a nuclear receptor which is capable of interacting
with the nuclear protein through binding to the signature motif,

wherein:

the nuclear protein is a bridging factor that is responsible for the interaction between a
liganded nuclear receptor and a transcription initiation complex involved in regulation of gene
expression;

the nuclear receptor is a transcription factor;

the signature motif is a short sequence of amino acid residues which is the key structural
element of a nuclear protein which binds to a liganded nuclear receptor as part of the process
of the activation or repression of target genes; and

in which the method comprises taking:

i) the potential inhibitor compound;

ii) the liganded nuclear receptor or a fragment thereof in which the fragment comprises
the second region defined in this claim in b) above;

iii) a nuclear protein fragment comprising a signature motif of the nuclear protein; and

iv) detecting the presence or absence of inhibition of the interaction between ii) and iii).
The term “nuclear protein” means the bridging factors (including coactivators) that are
responsible for the interaction between a liganded nuclear receptor and the transcription
initiation complex involved in regulation of gene expression (reviewed for steroid hormone
receptors in Beato, M., Herrlich, P. & Schutz, G. Cell 83, 851-857 (1995)). The term
“bridging factor” may include part of the transcription initiation complex itself. The term
“nuclear receptor” means the family of nuclear receptors such as described in Mangelsdorf,
D.J., et al. Cell 83, 835-839 (1995). The term “signature motif” means a short sequence of
generally at least about 5 amino acids, preferably 4-10, more preferably 5-10 amino acid
residues which is the key structural element of a nuclear protein which binds to a liganded
nuclear receptor as part of the process of the activation or repression of target genes. The term
“liganded nuclear receptor” means an activated nuclear receptor, for example, with ligand
bound thereto. The ligand can take various forms such as for example a hormone, a small
molecule compound or a peptide. For example, in the case of some of the nuclear receptors
(e.g. PPAR) there are non hormonal peptide ligands e.g. leukotrienes. Although nuclear
receptors are generally activated through ligand binding, some receptors, such as for example
orphan receptors, may be active without the need for ligand and/or activated through
non-ligand dependent pathways and these receptors are also within the scope of the invention
as a less preferred embodiment.

The term “fragment” means an incomplete part. Before the present invention, a skilled
person could not have known which fragment or fragments of a nuclear protein could be taken
to retain activity. Use of fragments compared with whole proteins is particularly
advantageous in screening assays. In a preferred embodiment a preferred fragment size of a
nuclear protein is 8-10 amino acids, such as, for example, shown in Figure 1A herein. The
liganded nuclear receptor is preferably in the form of a fragment. In general, fragments
comprise at least 8 amino acids.

Preferably the signature motif is represented by B1XXLL in which B1 is any natural
hydrophobic amino acid, L is leucine and X represents any natural amino acid. Values for X
within the signature motif are independently selected i.e. X may be the same or different.
Preferably B1 is leucine or valine with leucine being most preferred. In some instances the
preferred signature motif is further defined as B2B1XXLL wherein “B2” is a hydrophobic
amino acid residue as defined for B1. A “natural hydrophobic amino acid” is defined as any
one of isoleucine, leucine, methionine, phenylalanine, tryptophan, tyrosine or valine.

Preferably the signature motif is in the conformation of a helix, preferably an amphipathic
helix, and the leucine residues form a hydrophobic face thereof. Preferably the signature
motif is positioned within a molecule so that it is available at the surface thereof for
interaction with proteins. Preferably values of X do not include Cys or Pro. Preferably at
least one value of X is not a natural hydrophobic amino acid or proline. One value of X is
preferably independently selected from Arg, Asn, Asp, Glu, Gln, His, Lys, Ser, Thr, Gly or
Ala, more preferably one value of X is independently selected from Arg, Asn, Asp, Glu, Gln,
His or Lys. Without wishing to be bound by theoretical considerations, it is believed that
preferred values of X favour the signature motif forming an amphipathic helix.
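The motif definition and the stated preferences for X can be written down as a small check. The following is a minimal sketch only, assuming one-letter amino acid codes; the function and constant names are illustrative and are not part of the specification.

```python
import re

HYDROPHOBIC = "ILMFWYV"               # natural hydrophobic residues listed above
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 natural amino acids

# B1XXLL: B1 hydrophobic, X any natural amino acid, followed by two leucines.
CORE_MOTIF = re.compile(rf"[{HYDROPHOBIC}][{AMINO_ACIDS}]{{2}}LL")
# B2B1XXLL: the same motif preceded by a second hydrophobic residue (B2).
EXTENDED_MOTIF = re.compile(rf"[{HYDROPHOBIC}]{{2}}[{AMINO_ACIDS}]{{2}}LL")


def is_signature_motif(pentamer: str) -> bool:
    """True if the five residues fit B1XXLL."""
    return CORE_MOTIF.fullmatch(pentamer) is not None


def meets_x_preferences(pentamer: str) -> bool:
    """Apply the stated preferences for the two X positions:
    neither X is Cys or Pro, and at least one X is not hydrophobic."""
    if not is_signature_motif(pentamer):
        return False
    x_residues = pentamer[1:3]
    if any(x in "CP" for x in x_residues):
        return False
    return any(x not in HYDROPHOBIC for x in x_residues)
```

For instance, LHRLL satisfies both checks, whereas LPQLL fits B1XXLL but fails the preference against proline at X.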

After the first filing date of the present invention there was a simultaneous publication
of the identification of LXXLL motifs by two groups (one of which included the inventors of
the present invention), namely Heery et al., Nature, 12th June 1997, 387, 733-736 & Torchia et
al., Nature, 12th June 1997, 387, 677-684.

Herein we show that the ability of a nuclear protein (SRC1) to bind a nuclear receptor
(liganded ER) and enhance its transcriptional activity is dependent upon the integrity within
the nuclear protein of the signature motif (LXXLL; SEQ ID NO: 1), as well as key
hydrophobic residues in the conserved helix (Helix 12) of NRs required for their
ligand-induced activation function (AF-2) (ref 14). The signature motif is also found in TIF1,
TIF2, p300, RIP140 and the TRIP proteins, and occurs within regions of these proteins known to
be sufficient for interaction with NRs. Thus the LXXLL motif (SEQ ID NO: 1) is a signature
sequence which facilitates the interaction of diverse proteins with nuclear receptors, and thus
is a key part of a new family of nuclear proteins.

A preferred nuclear protein is a coactivator, in particular the nuclear protein includes
any one of RIP140, SRC-1, TIF2, CBP, p300, TIF1, Trip1, Trip2, Trip3, Trip4, Trip5, Trip8
or Trip9. Further preferred nuclear proteins include p/CIP, ARA70 & Trip230.

In this specification a reference to a nuclear protein or nuclear receptor includes
isoforms thereof unless stated or otherwise implicit from the context. An isoform is one of a
family or collection of related proteins derived from a single gene. Thus isoforms may differ
slightly in their amino acid sequences such as for example from differential splicing of exons
following transcription. SRC1a is an example of an isoform of SRC1. Two isoforms of SRC-1,
namely SRC1a and SRC1e, have been shown to contain differences in number of signature
motifs and to be functionally distinct in so far as they appear to play different roles in
ER-mediated transcription (Kalkhoven et al, 1998, EMBO Journal, 17, 232-243).

Nuclear receptors are transcription factors. A preferred transcription factor comprises
at least part of a conserved amphipathic α-helix, and especially preferred is retinoic acid
receptor or a steroid hormone receptor. Preferred steroid hormone receptors are oestrogen
receptor, progesterone receptor, androgen receptor and glucocorticoid receptor with oestrogen
receptor being especially preferred.

Preferably the second region comprises at least part of a conserved amphipathic α-helix
such as for example Helix 12 in the oestrogen receptor which is especially preferred.

An especially preferred combination of nuclear receptor and nuclear protein is one in
which the nuclear receptor is oestrogen receptor and the nuclear protein is selected from
SRC1, TIF2, CBP and p300, with SRC1 and especially SRC1a being most preferred.

A preferred method is in the form of a 2-hybrid assay system. Such assay systems are
well known in the art; suitable references include Fields & Sternglanz (1994) TIG, August
1994, 10, 286-292 and US patent 5283173.

Any suitable assay design may be employed such as for example radioisotopic assay,
scintillation proximity assay (reviewed by ND Cook, 1996, Drug Discovery Today, 1, 287-294)
or fluorescence, particularly time resolved fluorescence, assay (reviewed by MV Rogers,
1997, Drug Discovery Today, 2, 156-160). High-throughput screening technologies have
been reviewed by Houston & Banks in Current Opinion in Biotechnology 1997, 8, 734-740.

In a preferred embodiment of the invention the potential inhibitor is in the form of a
peptide library based on a signature motif.

Encoded peptide libraries generally have a maximum of 20 possible amino acids at
any one position. In practice, using current technology, it is difficult to screen libraries of
more than 10^7 to 10^8 members, which means that it is difficult to randomise more than 6
positions in a peptide. Hence, narrowing down the nuclear protein binding region to a signature
motif is of great advantage if a peptide library approach is to be employed.
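The arithmetic behind the six-position limit can be checked directly. This is a minimal sketch; the ceiling of 10^8 members is an assumed reading of the practical screening limit, and the names are illustrative.

```python
AMINO_ACIDS_PER_POSITION = 20
ASSUMED_CEILING = 10**8  # assumption: upper end of the practical screening limit


def library_size(randomised_positions: int) -> int:
    """Number of distinct peptides when each randomised position can be
    any of the 20 natural amino acids."""
    return AMINO_ACIDS_PER_POSITION ** randomised_positions


if __name__ == "__main__":
    for n in range(4, 8):
        size = library_size(n)
        status = "within" if size <= ASSUMED_CEILING else "exceeds"
        print(f"{n} positions: {size:.2e} members ({status} the assumed ceiling)")
```

Six randomised positions give 20^6 = 6.4 x 10^7 members, whereas a seventh position raises this to about 1.3 x 10^9, which is why fixing the leucines of the motif and varying only the remaining positions keeps a library screenable.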

A peptide library is a collection of peptides of varying sequences. There are in general
two ways to generate peptide libraries (reviewed by Scott, 1992; Birnbaum and Mosbach,
1992; Houghten, 1993; see also Abelson, 1996). The first approach is to generate libraries in
which positive peptides are identified through the sequencing of the peptides themselves.
Mixtures of peptides may be chemically synthesised in such a way that the peptides are linked
to beads, so that each bead contains only one peptide. If a bead is identified which contains a
positive peptide, the bead may be recovered and the peptide identified by chemical
sequencing. This approach was first demonstrated using the ability of antibodies to identify
specific six amino acid peptides from mixtures (Lam et al, 1991). The importance of using
beads is that the identification event (in this case the antibody:peptide interaction) leads to the
recovery of a bead which contains more peptide than that bound by the antibody itself. In a
related approach mixtures of free peptides can be synthesised and screened in pools; by using
a deconvolution process, positive peptides are identified (Houghten et al, 1991).

The second approach is to generate libraries in which positive peptides are identified
by sequencing a molecule which is associated in some way with the peptide. Peptides and
some other molecule, such as a nucleic acid, can be cosynthesised on beads, so that each
nucleic acid “tags” the peptide found on the same bead. If a bead is identified which is
positive, sequencing the nucleic acid will identify the peptide on this bead (Brenner and
Lerner, 1992). Alternatively, peptide libraries can be generated using the gene expression
machinery of living organisms. In this approach, it is not necessary to make a library of
peptide molecules. Instead a library of DNA molecules is constructed, so that each molecule
encodes a peptide with a different sequence. This encoded library must then be expressed in a
suitable host organism in order that the peptide library be produced. The library is then
screened. It is essential that the nucleic acid which encodes the library remain physically
linked to the protein in some way, so that recovery, or identification, of the active peptides
leads to recovery of the DNA which encodes these peptides. The sequences of the active
peptides may then be deduced by sequencing of the DNA which encodes them. Several
variations of this approach have been described.

The most widely used version of this approach is to express peptides as part of the
coat proteins of a virus such as M13. The viruses can be screened by the ability of this coat
protein to bind target proteins such as antibodies (Devlin et al, 1990; Scott and Smith, 1990;
Cwirla et al, 1990) or receptors (the atrial natriuretic peptide receptor: Cunningham et al,
1994; and the thrombopoietin receptor: Cwirla et al, 1997). The approach may also be used
to find protease inhibitors through their ability to bind to proteases (Roberts et al, 1992;
Markland et al, 1996) as well as to find optimal substrates for proteases, such as stromelysin
and matrilysin (Smith et al, 1995) and subtilisin (Matthews and Wells, 1993).

An intracellular approach to the generation of peptides that recognise certain proteins
is to use the yeast two-hybrid system. In the two-hybrid system, interacting proteins are fused
to domains of transcription factors. If a protein:protein interaction occurs, then transcription of
a reporter gene is stimulated (Fields and Song, 1989). By making one component of the
two-hybrid system a peptide library, and selecting for cells in which reporter gene output
occurs, it is possible to isolate peptides which bind to a target protein. This approach was
used to identify peptides which bound to the retinoblastoma protein (Yang et al, 1995). In a
similar approach, Colas et al (1996) expressed the peptide library as a loop in the surface of
the E. coli TrxA protein and isolated peptides which bound to cyclin-dependent kinase 2 (Cdk2).

According to another aspect of the present invention there is provided a method of
reducing the interaction between

a) a first region which is a signature motif on a nuclear protein, and

b) a second region which is that part of a nuclear receptor which is capable of interacting
with the nuclear protein through binding to the signature motif,

in which the method comprises adding an inhibitor in the presence of the nuclear receptor and
the nuclear protein, the inhibitor being characterised in that it reduces the interaction between
the first region of the nuclear protein and the second region of the nuclear receptor.

According to another aspect of the present invention there is provided a novel inhibitor
as described above. Preferably the inhibitor is a peptide, more preferably a peptide
comprising the signature motif defined above, and more preferably the peptide has less than
15 amino acid residues. Especially preferred inhibitors are any one of the following peptides:
PQAQQKSLLQQLLT (SEQ ID NO: 2), KLVQLLTTT (SEQ ID NO: 3), ILHRLLQE (SEQ
ID NO: 4), or LLQQLLTE (SEQ ID NO: 5). Peptides may be prepared using conventional
techniques for example using solid phase synthesis and Fmoc chemistry. These peptides are
expected to be useful in the treatment of oestrogen responsive tumours. Inhibitors of the
invention are expected to be useful in the treatment of any disease mediated through
interaction between a signature motif on a nuclear protein and a nuclear receptor. For
example, suitable inhibitors are expected to be useful in treatment of cancer or inflammation.

A novel inhibitor, for example, could be an antibody against a signature motif or a
novel small molecule which binds to the signature motif or its complementary binding target
(nuclear receptor second region) such that normal biological activity is prevented. Examples
of small molecules include but are not limited to small peptides or peptide-like molecules.

Whilst the signature motif is demonstrated herein to apply across nuclear proteins as a
class it is expected that different nuclear receptors display both coactivator and signature
motif preferences that contribute to specificity of hormonal response (Ding et al, 1998,
Molecular Endocrinology, 12, 302-313) which in turn points to selective pharmaceutical
intervention opportunities. Figures 1A and 5 below also indicate that individual motifs may
differ in the strength with which they bind to nuclear receptors.

Note the NR box of Le Douarin et al (discussed above) did not disclose a signature
motif within the meaning of the present invention because, for example, the NR box within
the meaning of Le Douarin would be present in at most only 4 of the 39 signature motifs
identified by the present invention in Figures 3A & 4 (see below). Furthermore, Le Douarin
et al did not even suggest inhibitors of nuclear receptor - nuclear protein interaction.

The term “antibodies” is meant to include polyclonal antibodies, monoclonal
antibodies, and the various types of antibody constructs such as for example F(ab’)2, Fab and
single chain Fv. Antibodies are defined to be specifically binding if they bind with a Ka of
greater than or equal to about 10^7 M^-1. Affinity of binding can be determined using
conventional techniques, for example those described by Scatchard et al., Ann. N.Y. Acad.
Sci., 51: 660 (1949).

Polyclonal antibodies can be readily generated from a variety of sources, for example,
horses, cows, goats, sheep, dogs, chickens, rabbits, mice or rats, using procedures that are
well-known in the art. In general, immunogen is administered to the host animal typically
through parenteral injection. The immunogenicity may be enhanced through the use of an
adjuvant, for example, Freund’s complete or incomplete adjuvant. Following booster
immunizations, small samples of serum are collected and tested for reactivity. Examples of
various assays useful for such determination include those described in: Antibodies: A
Laboratory Manual, Harlow and Lane (eds.), Cold Spring Harbor Laboratory Press, 1988; as
well as procedures such as countercurrent immuno-electrophoresis (CIEP),
radioimmunoassay, radioimmunoprecipitation, enzyme-linked immuno-sorbent assays
(ELISA), dot blot assays, and sandwich assays, see U.S. Patent Nos. 4,376,110 and 4,486,530.

Monoclonal antibodies may be readily prepared using well-known procedures, see for
example, the procedures described in U.S. Patent Nos. 4,902,614, 4,543,439 and 4,411,993;
Monoclonal Antibodies, Hybridomas: A New Dimension in Biological Analyses, Plenum
Press, Kennett, McKearn, and Bechtol (eds.), (1980).

Monoclonal antibodies can be produced using alternative techniques, such as those
described by Alting-Mees et al., “Monoclonal Antibody Expression Libraries: A Rapid
Alternative to Hybridomas”, Strategies in Molecular Biology 3: 1-9 (1990) which is
incorporated herein by reference. Similarly, binding partners can be constructed using
recombinant DNA techniques to incorporate the variable regions of a gene that encodes a
specific binding antibody. Such a technique is described in Larrick et al., Biotechnology, 7:
394 (1989).

According to a further feature of the invention there is provided a pharmaceutical
composition which comprises a novel inhibitor of the invention, or a
pharmaceutically-acceptable salt thereof, in association with a pharmaceutically-acceptable
diluent or carrier.

The composition may be in a form suitable for oral use, for example a tablet, capsule,
aqueous or oily solution, suspension or emulsion; for topical use, for example a cream,
ointment, gel or aqueous or oily solution or suspension; for nasal use, for example a snuff,
nasal spray or nasal drops; for vaginal or rectal use, for example a suppository; for
administration by inhalation, for example as a finely divided powder such as a dry powder, a
microcrystalline form or a liquid aerosol; for sub-lingual or buccal use, for example a tablet or
capsule; or for parenteral use (including intravenous, subcutaneous, intramuscular,
intravascular or infusion), for example a sterile aqueous or oily solution or suspension. In
general the above compositions may be prepared in a conventional manner using conventional
excipients. For peptidic inhibitors, parenteral compositions are preferred.

The amount of active ingredient that is combined with one or more excipients to
produce a single dosage form will necessarily vary depending upon the host treated and the
particular route of administration. For example, a formulation intended for oral
administration to humans will generally contain, for example, from 0.5 mg to 2 g of active
agent compounded with an appropriate and convenient amount of excipients which may vary
from about 5 to about 98 percent by weight of the total composition. Dosage unit forms will
generally contain about 1 mg to about 500 mg of an active ingredient.

According to another aspect of the present invention there is provided a method of
mapping nuclear receptor interaction domains in nuclear proteins in which the method
comprises analysis of the sequence of a nuclear protein for the presence of signature motifs as
defined herein in order to identify an interaction domain or a potential interaction domain.
Preferably the analysis further comprises analysis of any potential interaction domains
identified thereby for α-helicity and/or surface accessibility.
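The sequence-analysis step of this mapping method can be sketched as a scan that reports where signature motifs occur. This is an illustrative sketch only: it covers the motif search, not the helicity or surface-accessibility analysis, and the function name is hypothetical.

```python
import re

# B1XXLL with B1 drawn from the natural hydrophobic residues. The lookahead
# makes the scan report overlapping motifs rather than skipping past them.
MOTIF = re.compile(r"(?=([ILMFWYV][ACDEFGHIKLMNPQRSTVWY]{2}LL))")


def map_signature_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, motif) for every B1XXLL occurrence."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Peptides SEQ ID NO: 2-5 from the description, used here as short inputs.
    for peptide in ("PQAQQKSLLQQLLT", "KLVQLLTTT", "ILHRLLQE", "LLQQLLTE"):
        print(peptide, map_signature_motifs(peptide))
```

Applied to a full-length protein sequence, each reported position marks a potential interaction domain that could then be examined for predicted α-helicity and surface accessibility.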
The invention is illustrated by the non-limiting Examples below in which, unless
stated otherwise: temperatures are expressed in degrees Celsius; and peptide sequences are
listed N-terminus to C-terminus.

Figures 1a/1b show the interaction of LXXLL motifs derived from coactivators
with the ER.

Figure 1a: Yeast two hybrid interactions of LXXLL motifs, derived from the proteins
RIP140, SRC1a and CBP with the LBDs of wild type or mutant ER. The sequences of the
LXXLL motifs in the DNA binding domain (DBD) fusion proteins are indicated.
DBD-LXXLL proteins were coexpressed with AAD-ER or AAD-ER Mut, which consist of an
acidic activation domain (AAD) fused to the LBD of the wild-type ER, or a transcriptionally
defective ER mutant, respectively. Reporter activities were determined in the presence or
absence of 10^-7 M 17-β-estradiol (E2) and expressed as units of β-galactosidase activity.
Sequences listed in Figure 1a, from top to bottom, are listed as SEQ ID NO: 6-23
respectively.

Figure 1b: Effects of mutations in the RIP140 LXXLL motif located at amino acids 935-943
on binding of AAD-ER. Conserved leucine residues are boxed and mutated residues are
circled. The reporter activity was determined in the presence (black bars) or absence (white
bars) of 10^-7 M E2. Sequences listed in Figure 1b, from top to bottom, are listed as SEQ ID
NO: 24-32 respectively.

Figures 2a/2b/2c show that LXXLL motifs are required for binding of SRC1 to
the ER LBD in vitro and for the ability of SRC1 to enhance ER activity in vivo.

Figure 2a: Wild type (SRC1a) and mutant (SRC1a-M1234) SRC1 proteins are shown
schematically. The black bars represent the approximate locations of the LXXLL binding
motifs in the linear SRC1a sequence and the shaded circles indicate the mutation of LXXLL
binding motifs by replacement of conserved leucine residues with alanines (see Methods).
Binding of wild type SRC1a or SRC1a-M1234 mutant to glutathione S transferase (GST)
alone, to the ligand binding domain (aa 313-599) of ER (GST-AF2), or the SRC1 binding
domain (aa 2058-2163) of CBP (GST-CBP) in the presence (+) and absence (-) of 10^-6 M E2.
The signals obtained with 10% of the input of [35S]-labelled wild type and mutant SRC1
proteins are shown.
Figure 2b: The ability of increasing amounts of the peptides P-1 (SEQ ID NO: 2) and P-2
(SEQ ID NO: 72) to compete against the binding of wild type SRC1a to GST-AF2 in the
presence of ligand is shown. The sequences of the P-1 and P-2 peptides are given at the foot
of Fig 2b, and the conserved leucines and alanine substitutions are boxed.

Figure 2c: Wild type SRC1e, but not the mutant SRC1e-M123, potentiates activation by ER of
the reporter gene 2ERE-pS2-CAT in transiently transfected HeLa cells. Reporter activities
were obtained from extracts of transfected cells grown in the absence (white columns) or
presence (black columns) of ligand (10^-8 M E2). The amounts of ER, SRC1-wt and SRC1-mut
expression plasmids used in the transfections are indicated below the graph. The activities
shown are averaged from duplicates.

Figures 3a/3b & 4 show that the LXXLL sequence is a signature motif in proteins 
that bind the LBDs of NRs. 

Figure 3a: Alignment of LXXLL motif sequences present in human RIP140 (ref 10), human
SRC1a, mouse TIF2 (ref 6), mouse CBP (refs 23, 24), p300 (ref 33), mouse TIF1 (ref 11) and
human TRIP proteins (ref 12). The conserved leucines are boxed and the amino acid numbers are
given for each motif. Sequences listed in Figure 3A, from top to bottom, are listed as SEQ ID
NO: 33-61 respectively.

Figure 3b: Schematic representation of the incidence of LXXLL motifs (black bars) in the
sequences of proteins which bind NRs. The amino acid boundaries of the known NR binding
sites are also shown.

Figure 4: Alignment of LXXLL motif sequences present in CBP, p300, p/CIP, ARA70 &
Trip230. The conserved leucines are boxed and the amino acid numbers are given for each
motif. Sequences listed in Figure 4, from top to bottom, are listed as SEQ ID NO: 62-71
respectively.

Figure 5 shows that the precise nature of the LXXLL signature motif affects the
strength of the interaction between SRC-1a and the LBDs of ERα, ERβ and GR.

Figure 5a: The ERα LBD binds to the 4 signature motifs of SRC-1a with differing affinities in
yeast 2-hybrid assays (SRC-1a motif 1-4 = SEQ ID NO: 73-76). The order (decreasing affinity)
is SRC-1a motif 2 > SRC-1a motif 4 > SRC-1a motif 1 > SRC-1a motif 3.




WO 98/49561    PCT/GB98/01238

Figure 5b: The ERβ LBD binds to the signature motifs of SRC-1a with the same relative
affinities seen with ERα. The order (decreasing affinity) is SRC-1a motif 2 > SRC-1a motif
4 > SRC-1a motif 1 > SRC-1a motif 3.

Figure 5c: The GR LBD binds to the signature motifs of SRC-1a with differing affinities and
the rank order of affinities differs from those seen with ERα and ERβ. The order (decreasing
affinity) is SRC-1a motif 4 > SRC-1a motif 1 = SRC-1a motif 2 = SRC-1a motif 3.

In Figure 5 the y-axis units represent relative β-galactosidase activity. In Figure 5 the x-axis
motif numbering only indicates part of the actual sequence used which was as follows:
627-640, 684-696, 743-755 and 1428-1441.

The following abbreviations are used.

AAD     acidic activation domain
AF      activator function
DBD     DNA binding domain
E2      17-β-estradiol
ER      estrogen receptor
GR      glucocorticoid receptor
GST     glutathione S transferase
LBD     ligand binding domain
NR      nuclear receptor
PCR     polymerase chain reaction
RAR     retinoic acid receptor
RIP     receptor interacting protein
RXR     retinoid X receptor
SRC     steroid receptor coactivator
TIF     transcriptional intermediary factor
TRβ     thyroid hormone receptor β

Standard amino acid abbreviations have been used.

Alanine           Ala    A
Arginine          Arg    R
Asparagine        Asn    N
Aspartic Acid     Asp    D
Cysteine          Cys    C
Glutamic Acid     Glu    E
Glutamine         Gln    Q
Glycine           Gly    G
Histidine         His    H
Isoleucine        Ile    I
Leucine           Leu    L
Lysine            Lys    K
Methionine        Met    M
Phenylalanine     Phe    F
Proline           Pro    P
Serine            Ser    S
Threonine         Thr    T
Tryptophan        Trp    W
Tyrosine          Tyr    Y
Valine            Val    V
Any amino acid    Xaa    X

Point mutations will be referred to as follows: natural amino acid (using the 1 letter
nomenclature), position, new amino acid. For example “L636A” means that at position 636 a
leucine (L) has been changed to alanine (A). Multiple mutations will be shown between
square brackets.
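The point-mutation notation can be read mechanically. The following is a minimal sketch, assuming one-letter codes; the function name and the first_position parameter (which lets a fragment keep the numbering of the full-length protein) are illustrative and not taken from the specification.

```python
import re

# natural amino acid, position, new amino acid, e.g. "L636A"
MUTATION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_point_mutation(sequence: str, mutation: str, first_position: int = 1) -> str:
    """Apply a mutation such as 'L636A' to a sequence whose first residue
    carries the number first_position."""
    match = MUTATION.fullmatch(mutation)
    if match is None:
        raise ValueError(f"not a point mutation: {mutation!r}")
    original, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - first_position
    if not 0 <= index < len(sequence):
        raise ValueError(f"position outside the sequence: {mutation!r}")
    if sequence[index] != original:
        raise ValueError(f"expected {original} but found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]
```

For example, applying "L2A" to the peptide KLVQLLTTT gives KAVQLLTTT; the check on the original residue guards against applying a mutation to the wrong numbering frame.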
Example 1

Mapping of interaction sites between Nuclear Receptor and Nuclear Protein

It has been previously demonstrated that the 140 kDa receptor interacting protein
(RIP140) binds directly to NRs through at least two distinct sites located at the N- and
C-termini of the protein 21. To map these interaction sites in more detail, we examined a
series of twenty different PCR-generated fragments of RIP140 coding sequence, fused in frame
with a heterologous DBD, for interaction with NRs in a two hybrid system. Remarkably,
although the different constructs spanned the entire 1158 amino acids of RIP140 sequence,
all but two displayed ligand-dependent interaction with ER, including five non-overlapping
RIP140 sequences. By comparison of the sequences of the shortest interacting fragments we
identified a short motif (LXXLL) common to all interacting fragments. In total, nine copies
of the motif were identified in the RIP140 sequence, but the motif was absent in fragments
showing no binding activity in our experiments.
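Locating copies of the LXXLL motif in a protein sequence is a simple pattern search. The following is a minimal sketch, assuming a one-letter amino acid string as input; `find_lxxll` is a hypothetical helper, not the procedure used by the inventors, and it reports sequence matches only, without regard to secondary structure or accessibility.

```python
import re
from typing import List, Tuple

# L, any two residues, then two leucines. The lookahead lets overlapping
# copies of the motif be reported separately.
LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each motif."""
    return [(m.start() + 1, m.group(1)) for m in LXXLL.finditer(sequence.upper())]


# Peptides named in this document (SEQ ID NO: 3 to 5).
for peptide in ("KLVQLLTTT", "ILHRLLQE", "LLQQLLTE"):
    print(peptide, find_lxxll(peptide))
```

Positions are reported 1-based to match the residue numbering convention used for the mutations.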

To determine if these short sequences were sufficient to bind to NRs, we constructed a
series of proteins consisting of a DBD fused to eight to ten amino acids incorporating one
copy of each of the nine LXXLL motifs. As shown in Fig. 1a, each of the nine motifs present
in RIP140 displayed strong ligand-dependent interaction with the LBD of the ER whereas the
DBD alone showed no ability to bind. (Note that of the 10 motifs listed for RIP140, the 10th
is a repeat of the O"* locus.) Comparable results were obtained with the LBD of RAR (data not
shown). Mutations of hydrophobic residues within H12 abolish AF2 activity and prevent the
recruitment of RIP140 10, TIF1 11, TIF2 6, SUG1 13 and SRC1. Similarly, mutation of H12
residues M543 and L544 in the ER abolished the ligand-dependent interaction of all nine
LXXLL motifs with ER (Fig. 1a). Taking these results together, we conclude that a short
conserved motif comprised within as little as eight amino acids is sufficient to bind to
transcriptionally active NRs. This discovery that such a relatively small motif can affect the
interaction between two relatively large molecules is unprecedented in this field.

Secondary structure analysis using the Phd program 22 revealed that each of the nine
copies of this motif in RIP140 occurred within a region predicted to be α-helical in nature, in
which the conserved leucines would form a hydrophobic face.
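Why the three leucines of an LXXLL motif would fall on one face of a helix can be illustrated with a helical wheel calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn, that is 100° of rotation per residue; this geometry is a textbook assumption and is not stated in the text, and the function names are illustrative.

```python
from typing import List, Sequence

DEGREES_PER_RESIDUE = 100.0  # assumed ideal alpha-helix: 360 / 3.6


def wheel_angles(offsets: Sequence[int]) -> List[float]:
    """Angular position around the helix axis for each residue offset."""
    return [(offset * DEGREES_PER_RESIDUE) % 360.0 for offset in offsets]


def angular_spread(angles: Sequence[float]) -> float:
    """Smallest arc (degrees) that contains all the given angles."""
    ordered = sorted(angles)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(ordered[0] + 360.0 - ordered[-1])  # wrap-around gap
    return 360.0 - max(gaps)


# Leucines of L-X-X-L-L sit at offsets 0, 3 and 4 from the first leucine.
leucine_angles = wheel_angles([0, 3, 4])
print(leucine_angles)                  # [0.0, 300.0, 40.0]
print(angular_spread(leucine_angles))  # 100.0
```

Under this assumption the three conserved leucines lie within a 100° arc of the wheel, consistent with the description of a hydrophobic face.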



Example 2

Mutational Analysis of a Signature Motif 

To determine the sequence constraints required to observe a functional interaction, we
carried out a partial mutational analysis of one of the RIP140 motifs (amino acids 935-943;
Fig. 1b). While western blot analysis showed no significant variation in the expression of the
wild type and mutant fusion proteins (data not shown), mutation of valine 935 to alanine
resulted in an approximately ten fold reduction in the reporter activity in the presence of
ligand. Coupled with the observation that the first amino acid is hydrophobic in seven of
the nine LXXLL motifs in RIP140, this may indicate a preference for a hydrophobic residue at
this position. Strikingly, mutation of any one of the three conserved leucine residues L936,
L939 or L940 to alanine resulted in a complete loss of binding to the LBD of ER (Fig. 1b) and
RAR (data not shown), emphasising their importance in mediating the interaction with NRs.
In contrast, mutation to alanine of L941 (which is not conserved among the motifs; see Fig.
3a) had no effect on the ability of this sequence to bind to the ER LBD. Replacement of a
conserved leucine residue with a valine was tolerated at L936, but not at L939 or L940,
indicating that hydrophobic character alone is not sufficient to maintain an interaction with
ER (Fig. 1b). The amino acids K937, Q938, S942 and E943 were not subjected to
mutagenesis as they are not conserved among the motifs we have identified (see Fig. 3a).

Example 3

Analysis of Signature Motifs in Nuclear Proteins 

The steroid receptor coactivator SRC1, which stimulates ligand-dependent
transcriptional activity, was originally identified as a partial cDNA encoding a protein capable
of interacting with the progesterone receptor by means of a 196 amino acid C-terminal region
3. We noted that the eight most C-terminal amino acids fit the LXXLL consensus, and indeed
this sequence (DBD-SRC1a 1434-1441) displayed strong ligand-induced binding to ER, but
not the ER H12 mutant (Fig. 1a). Subsequent studies have identified full-length SRC1
(SRC1a) from mouse (1459 amino acids) 4,5 and human (1441 amino acids) tissues. Both
murine and human SRC1a proteins interact with multiple NRs, and contain an additional
interaction region between residues 569 to 789 5 and 570-780, respectively. Three copies of
the LXXLL motif were identified in this central interaction domain of human SRC1a (see Fig.
3a & 3b), each of which displayed ligand-dependent binding to both ER (Fig. 1a) and RAR
(data not shown) in the two hybrid assay, but not the ER H12 mutant. Interestingly, the
sequences and relative positions of the three motifs in the central domain of SRC1a are
conserved in the related coactivator protein Transcriptional Intermediary Factor 2 (TIF2) (Fig.
3a + b), and correspond to the region of TIF2 known to bind to NRs 6. However, unlike
SRC1a, TIF2 appears to lack a motif at its C-terminus. In addition, we noted that SRC1
contains three other sequences matching the LXXLL consensus. Although the motif at
residues 45-53 is predicted by the Phd program to be α-helical and lies within the basic
helix-loop-helix domain at the N-terminus of SRC1a 5, it showed only a very weak (6-fold)
interaction with liganded ER (Fig. 1a) or RAR (not shown) in the yeast two hybrid assay.
This is consistent with the observed absence of strong NR-binding activity associated with the
N-terminus of SRC1. The other two motifs within residues 111-118 and 912-920 both
contain proline residues and are unlikely to adopt α-helical structure according to the Phd
program. Indeed, these sequences showed no detectable interaction with NRs in our binding
assays (Fig. 1a), which strongly suggests a preference for appropriate secondary structure for
binding of LXXLL sequences to NRs.

Recent reports have indicated that CBP/p300 proteins, which were originally
identified as coactivators for CREB 23,24, are coactivators for many transcription factors
including NRs 4,5,8,9 and may serve as integrators of several signalling pathways 4. CBP was
shown to bind directly to NRs via its N-terminal 101 amino acids 4, with a possible
RXR-specific binding site between residues 356-495 8. Our analysis showed that the CBP sequence
harbours copies of the LXXLL motif within positions 68-78 and 356-364, which are
conserved in the p300 sequence (amino acids 80-90 and 341-351; Fig. 3a). Indeed, when
tested in the two hybrid assay, the N-terminus of CBP (amino acids 1-101; data shown) and
the LXXLL motif at residues 68-75 of the CBP sequence (Fig. 1a) displayed ligand-dependent
binding to ER, but not the transcriptionally defective ER mutant (Fig. 1a).



Example 4 



The Binding Of Coactivator Proteins To NRs Is Dependent On Signature Motifs 
To demonstrate that the binding of coactivator proteins to NRs is dependent on
LXXLL motifs, we introduced alanine substitutions in SRC1a at the conserved leucine
couplets at residues [L636A, L637A, L693A, L694A, L752A, L753A, L1438A, L1439A],
thus effectively creating a mutant protein (SRC1a-M1234) in which all four functional
binding motifs were disabled. We then compared the ability of in vitro translated SRC1a and
SRC1a-M1234 to bind to the ligand binding domain of the mouse estrogen receptor fused to
glutathione-S-transferase (GST-AF2) in GST pulldown experiments (Maniatis et al., (1982)
Molecular Cloning. A Laboratory Manual. Cold Spring Harbour Laboratory, Cold Spring
Harbour, New York.). As shown in Fig. 2a, while wild type SRC1a protein displayed
ligand-dependent binding to GST-AF2, SRC1a-M1234 failed to bind to GST-AF2 either in the
presence or absence of ligand. To confirm that the mutations did not induce gross structural
disruption of SRC1a-M1234, we compared the ability of the in vitro translated proteins to
interact with amino acids 2058-2163 of CBP, which was previously defined as the SRC1
binding domain 4. Both proteins retained strong binding to GST-CBP (Fig. 2a), indicating that
this SRC1 function remained intact in both wild type and mutant proteins. In addition we
showed that the binding of wild type SRC1a to GST-AF2 was competed by increasing
concentrations of a short peptide (P1) corresponding to the motif at the C-terminus of SRC1a
(Fig. 2b). In contrast, a similar peptide (P2) in which the LXXLL motif was mutated, or
peptides unrelated to the LXXLL motif (data not shown), did not compete the binding of
SRC1a to GST-AF2 (Fig. 2b).

Finally, to demonstrate that LXXLL motifs are necessary for the function of SRC1 in
vivo, we compared the abilities of wild type SRC1 and a mutant protein in which all LXXLL
motifs were disabled to enhance the activity of mouse ER in transient transfection
experiments. As shown in Fig. 2c, wild type SRC1 enhanced the activity of ER in a
concentration-dependent manner. In contrast, the SRC1 mutant, which was unable to bind ER
(Fig. 2a), had no stimulatory effect, but reduced ER activity by up to 50% at the highest
concentration (Fig. 2c). This apparent dominant negative property of the mutant SRC1 is
likely due to its ability to maintain interactions with CBP while failing to interact with NRs
(Fig. 2a). This result is of interest given the recent evidence that SRC1 and CBP/p300 may
exist as a complex in vivo 9, and that CBP also has NR binding activity 4,8, as our data
suggest that the interactions between NRs and CBP are insufficient to compensate for the
inability of the SRC1a mutant protein to bind NRs, at least under these conditions. It remains
to be determined whether NRs are engaged simultaneously by p160 and p300 proteins
functioning independently or as a complex.

Examination of the sequences of other proteins known to bind to NRs revealed them to
contain one or more copies of the LXXLL motif. TIF1 contains a single motif (residues
722-732) within the minimal region known to be required for its interaction with NRs 11,25. The
truncated proteins TRIPs 2-5, TRIP8 and TRIP9, which were isolated in a two hybrid screen
for TR-interacting proteins 12, each contain at least one copy of the LXXLL motif (Fig. 3a),
whereas the motif was absent in TRIPs whose interaction with TR was ligand-independent.
An alignment of a selection of these sequences is shown in Fig. 3a, while Fig. 3b shows the
incidence of motifs in the sequences of RIP140, SRC1a, TIF1, TIF2, CBP and p300, and the
boundaries of known receptor interaction domains in these proteins. Interestingly, motifs
were also identified in several other proteins for which evidence exists of interaction with
NRs, including Ara70 26, SWI3 27, and the RelA (p65) subunit of NF-κB 28, although the
receptor interaction domains in these proteins have not been mapped. The ability of other
proteins containing LXXLL motifs to bind to NRs will depend on their subcellular
localisation, as well as the α-helicity and surface accessibility of the motifs. While it is clear
that the conserved leucine residues are essential for the function of the motif, other amino
acids may also be important given the degree of sequence conservation of equivalent motifs in
SRC1/TIF2 or CBP/p300.

As many NR binding proteins contain multiple copies of the LXXLL motif, it remains
to be established whether this facilitates the simultaneous contact of individual partners in
homo- and heterodimers of NRs, or whether it serves to provide alternative interaction
surfaces to accommodate conformational changes imposed by the binding of NRs to different
response elements. The systematic mutation of LXXLL motifs in coactivators such as SRC1
and CBP may allow us to decouple crosstalk or synergy between different signal transduction
pathways, and thus provide a better understanding of their proposed roles as coactivators and
integrators.
Example 5

Two Hybrid Interaction Assays 

The yeast reporter strain used for all two hybrid assays was W303-1B (HMLα MATα
HMRa his3-11,15 trp1-1 ade2-1 can1-100 leu2-3, 11, ura3) carrying the plasmid
pRLΔ21-U3ERE, which contains a lacZ reporter gene driven by three estrogen response elements
(EREs) 29. The plasmids pBL1 and pASV3, which express the human ER DNA binding
domain (DBD) and the VP16 acidic activation domain (AAD) respectively 30, were used to
generate DBD or AAD fusion proteins for two hybrid interaction analyses. DBD-LXXLL
motif fusion proteins were generated by ligation of phosphorylated, annealed oligonucleotide
pairs into the pBL1 vector. AAD-ER was constructed by cloning a PCR fragment encoding
amino acids 282-595 of the human ER into pASV3. AAD-ER Mut was constructed in a
similar fashion except that the amino acids M543 and L544 of ER were mutated to alanines
by recombinant PCR. All fusion constructs were fully sequenced. Transformants containing
the desired plasmids were obtained by selection for the appropriate plasmid markers and were
grown to late log phase in 15 ml of selective medium (yeast nitrogen base containing 1%
glucose and appropriate supplements) in the presence or absence of 10-’M 17-β-estradiol
(E2).

The expression of DBD- and AAD-fusion proteins in yeast cell-free extracts was
verified by immunodetection using a monoclonal antibody recognising the human ER (a gift
from P. Chambon, Strasbourg). The antibody recognises the “F” region of the LBD in the
human ER, and also the “F” region tag at the N-termini of the DBD fusion proteins 30. Equal
amounts of protein were electrophoresed on polyacrylamide gels and transferred to
nitrocellulose for western blotting. The preparation of cell-free extracts by the glass bead
method and the measurement of β-galactosidase activity in the extracts were performed as
previously described 29. Two hybrid experiments were repeated several times, and the data
shown in Figs. 1a and 1b represent reporter activities as measured in a single representative
experiment. The β-galactosidase activities are expressed as nmoles/minute/µg protein.
Example 6

In Vitro Binding And Peptide Inhibition Assays 

GST-AF2 consists of the ligand binding domain of the mouse ER (amino acids
313-599) fused to glutathione-S-transferase and has been described previously 31. GST-CBP
consists of GST fused to the SRC1-binding domain of CBP and was constructed by cloning a
PCR fragment encoding residues 2058-2163 of mouse CBP into the vector pGEX2TK
(Pharmacia). Human SRC1a and SRC1e cDNAs were isolated from a human B cell cDNA
library and cloned into a modified version of the expression vector pSG5. SRC1a-M1234 and
SRC1e-M123 were constructed by recombinant PCR to introduce the mutations [L636A,
L637A, L693A, L694A, L752A, L753A, L1438A, L1439A] or [L636A, L637A, L693A,
L694A, L752A, L753A] respectively. All SRC1 constructs were fully sequenced.
GST-SEPHAROSE(TM) beads were loaded with GST alone or GST-fusion proteins prepared from
bacterial cell-free extracts. [35S]-labelled SRC1 proteins were generated by in vitro
translation and tested for interaction with GST proteins in the presence or absence of 10-^M
estradiol (E2) as previously described 21. Binding was carried out for 3 hours at 4°C with
gentle mixing in NETN buffer (100 mM NaCl, 1 mM EDTA, 0.5% NP-40, 20 mM Tris HCl,
pH 8.0) containing protease inhibitors in a final volume of 1 ml. Peptides P-1 and P-2 were
dissolved in water at a concentration of 4 mg/ml and added individually to GST-binding
reactions immediately before the addition of ligand. The increasing amounts of peptide added
in the competition experiments shown corresponded to 2.5, 5, 12.5 and 25 µM.

Using analogous methodology, peptides KLVQLLTTT (SEQ ID NO: 3), ILHRLLQE 
(SEQ ID NO: 4) and LLQQLLTE (SEQ ID NO: 5) can be shown to be inhibitors. 



Example 7 

Transient Reporter Assays

HeLa cells were transfected with 1 µg of reporter 2ERE-pS2-CAT 32, 150 ng of
β-galactosidase expression plasmid (internal control), 10 ng of ER expression plasmid and 50 or
200 ng of SRC1 expression plasmids or empty vector per well (in duplicate) using 24-well
plates. Transfected cells were incubated overnight in Dulbecco’s modified Eagle’s medium
without phenol red and containing 10% charcoal-treated FBS, and washed in fresh medium
before addition of ligand (10^-8 M E2) or vehicle. After 40 hrs, cells were harvested and
extracts analysed for CAT and β-galactosidase activities 14,21. β-galactosidase activities were
used to correct for differences in transfection efficiency.
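The correction for transfection efficiency can be sketched as a ratio of reporter activity to internal control activity, averaged over duplicate wells. This is a minimal illustration under that assumption; the exact calculation used is not given in the text, the function names are hypothetical, and the numbers in the usage example are placeholders rather than measured data.

```python
from statistics import mean
from typing import Sequence, Tuple


def normalise(cat_activity: float, beta_gal_activity: float) -> float:
    """Reporter (CAT) activity divided by internal control (beta-gal) activity."""
    if beta_gal_activity <= 0:
        raise ValueError("internal control activity must be positive")
    return cat_activity / beta_gal_activity


def mean_normalised(wells: Sequence[Tuple[float, float]]) -> float:
    """Average the normalised activity over replicate wells."""
    return mean(normalise(cat, beta_gal) for cat, beta_gal in wells)


# Placeholder duplicate wells: (CAT activity, beta-galactosidase activity).
duplicate_wells = [(200.0, 2.0), (150.0, 1.0)]
print(mean_normalised(duplicate_wells))  # 125.0
```

Dividing each well by its own internal control before averaging keeps a poorly transfected well from dominating the mean.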

Example 8 

Pharmaceutical Composition

The following illustrates a representative pharmaceutical dosage form containing a 
peptide inhibitor and which may be used for therapy. 

Injectable solution 

A sterile aqueous solution, for injection, containing per ml of solution:

Peptide P-1                  5.0 mg
Sodium acetate trihydrate    6.8 mg
Sodium chloride              7.2 mg
Tween 20                     0.05 mg

A typical dose of peptide for adult humans is 30 mg.

Example 9

The strength of the interaction of coactivator signature motifs to NRs varies depending 
on the precise motif sequence 

The interaction between the ERα LBD and a range of different LXXLL motif fusion
proteins appeared to yield different reporter gene activity, indicating that the ERα LBD
interacted with the LXXLL motifs with differing affinities (see Example 5). The
strength of these interactions was therefore investigated more closely. The motifs of SRC1a
were tested for their relative interaction specificities with the glucocorticoid receptor and
estrogen receptor isoforms α and β in a yeast two hybrid assay.

The SRC1a motifs 1-4 (SEQ ID 73-76) were expressed as fusion proteins with the
LexA DNA binding domain. Motif fusion proteins were generated by ligation of annealed
oligonucleotide pairs in frame into the ADH promoter driven LexA DBD vector
YCp14-ADH-LexA. YCp14ADH1-LexA is a plasmid from which LexA is expressed under control of
the S. cerevisiae ADH1 promoter. The backbone of this vector is the plasmid RS314 (Sikorski
and Hieter, 1989). Between the Sac I and Kpn I restriction enzyme sites in the polylinker of
this vector we have placed an expression cassette comprising the promoter of the S. cerevisiae
ADH1 gene, the coding region of the E. coli LexA gene and the transcription termination
region of the S. cerevisiae ADH1 gene. The promoter region of ADH1 consists of a
1.4kb Bam HI-Hind III fragment from the plasmid pADNS (Colicelli et al, 1989). Following
the Hind III site is the coding region (amino acids 1-202) of E. coli LexA (Horii et al, 1981;
Miki et al, 1981; Markham et al, 1981). The E. coli LexA fragment was obtained as a Hind
III-Pst I fragment from the plasmid pBXL1 (Martin et al, 1990). This sequence corresponds
to nucleotides 95-710 of Genbank entry g146607. Following this sequence is a region which
encodes a polylinker (SEQ ID 77).

GAATTCCTGCAGCCCGGGGTCGACACTAGTTAACTAGCGGCCGC 

This polylinker adds the amino acids EFLQPGVDTS (SEQ ID NO: 80) to the carboxy
terminus of LexA. The Not I site at the end of this linker is linked to a DNA fragment which
includes the transcription terminator region of the S. cerevisiae ADH1 gene. This fragment is
the 0.6kb Not I-Bam HI fragment from the plasmid pADNS (Colicelli et al, 1989).
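The stated peptide can be checked by translating the polylinker in frame. The sketch below assumes the standard genetic code and that the reading frame begins at the first base of the polylinker sequence given above; `translate` is an illustrative helper and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA from its first base until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":  # stop codon ends the added peptide
            break
        protein.append(amino_acid)
    return "".join(protein)


POLYLINKER = "GAATTCCTGCAGCCCGGGGTCGACACTAGTTAACTAGCGGCCGC"
print(translate(POLYLINKER))  # EFLQPGVDTS
```

The translation ends at the TAA codon that follows ACT AGT, which accounts for the ten added residues.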

The NR LBDs were expressed as fusions with the Gal4 transcriptional AD. The LBD
fusions were constructed by cloning a PCR fragment encoding the amino acids corresponding
to the LBDs into YCp15Gal1-r11. YCp15Gal1-r11 is a plasmid from which fusions of the
activation region of the S. cerevisiae Gal4 protein may be expressed under the control of the
S. cerevisiae GAL1 promoter. The backbone of this vector is the plasmid RS315 (Sikorski and
Hieter, 1989). Between the Sac I and Kpn I restriction enzyme sites in the polylinker of this
vector we have placed an expression cassette comprising the promoter of the S. cerevisiae
GAL1 gene, the coding region of the fusion protein and the transcriptional termination region
of the S. cerevisiae ADH1 gene. The GAL1 promoter (Johnson and Davis, 1984; Yocum et al,
1984; West et al, 1984) was obtained by amplification of a S. cerevisiae genomic fragment by
the polymerase chain reaction (PCR). This fragment corresponds to nucleotides 177 to 809 of
Genbank database entry g171546. This is followed by the sequence

AAGCTTCCACCATGGTGCCAAAGAAGAAACGTAAAGTT (SEQ ID 78). 

This sequence provides a translation initiation codon and a sequence which encodes
the amino acids MVPKKKRKV (SEQ ID NO: 81). The last seven residues of this peptide
correspond to a region identified as a nuclear localisation signal in the SV40 T antigen (amino
acids 126-132 of Genbank entry g310678; Fiers et al, 1978; Reddy et al, 1978). This region is
linked to sequence encoding the region II transcription activation domain (amino acids
768-881) of Gal4, as defined by Ma and Ptashne (1987). The sequence was isolated by PCR from
plasmid pBXGalll (P. Broad unpublished), which is a mammalian version of the yeast
expression vector pMA236 (Ma and Ptashne, 1987), and corresponds to nucleotides 2744 to
3085 of Genbank entry g171557 (Laughon and Gesteland, 1984). This sequence is followed
by the polylinker

TCTAGACTGCAGACTAGTAGATCTCCCGGGGCGGCCGC (SEQ ID 79). 

All fusion constructs were fully sequenced. The vectors YCp14-ADH-lexA and
YCp15Gal1-r11 replicate as single copy plasmids in yeast (ARS-CEN) and have TRP1 and
LEU2 markers respectively.

The S. cerevisiae strain MEY132 (M. Egerton, Zeneca Pharmaceuticals, unpublished;
genotype MATa leu2-3,112 ura3-52 trp1 his4 rme1) was employed as a host strain. A
reporter gene consisting of the E. coli lacZ (β-galactosidase) gene under the control of a
promoter containing two binding sites for the lexA protein was integrated at the ura3 locus of
this strain. The reporter gene plasmid, JP159-lexRE, was constructed using plasmid JP159 as a
backbone (J. Pearlberg, Ph.D. Thesis (1994) Harvard University). This is a shuttle vector
which contains the S. cerevisiae URA3 gene as a marker. The plasmid contains the E. coli
β-galactosidase gene under the control of a minimal promoter containing the TATA box and
transcription initiation site from the S. cerevisiae GAL1 promoter (Dixon et al 1997).
Upstream of this promoter is a terminator from the GAL11 gene. Between the Xba I and Sal I
sites in the promoter of this plasmid was placed a 35 nucleotide sequence corresponding to
naturally occurring binding sites for the LexA protein in the promoter of the colicin E1 gene
(Ebina et al, 1983; the sequence corresponds to residues 20-54 of Genbank entry g144345). This
sequence contains two LexA operators and is therefore referred to as “2lex”. This reporter
plasmid was linearised within the URA3 gene and integrated into the ura3 locus of MEY132
to give the yeast strain MEY132-lexRE.

Transformants containing the 2 hybrid fusion constructs were grown to late log phase
in 20 ml selective medium (2% glucose and appropriate supplements). They were then diluted
into 2% galactose containing medium, in the presence or absence of ligand. As controls, each
LBD fusion was coexpressed with the LexA DBD lacking the coactivator motifs and
similarly each motif LexA DBD fusion was coexpressed with Gal4 AD lacking the NR LBD.
The relative expression levels of DBD fusion proteins were determined by immunodetection
using a monoclonal antibody which recognises LexA.

The estrogen receptor isoforms α and β relative interaction specificities for the 4
motifs of SRC1a are ranked as follows: 2>4>1>3 (see Figure 5). Glucocorticoid receptor
specificities are as follows: 4>1=2=3.

The yeast 2-hybrid system used in this Example utilises a single copy integrated
reporter gene construct. This enables a quantitative comparison between the different yeast
strains bearing the different SRC1a motifs. The data presented in Figure 5 suggest that motif
3 (SRC1a, 748-753) interacts very weakly, using this system, with ER. Longer exposures to
oestradiol (not shown) do however reveal a significant interaction. It is noted that Figure 1A
herein clearly shows an interaction between motif 3 and ER but this interaction is significantly
weaker than that seen with the other motifs. Any differences in this regard between Figures 5
and 1A can be explained by experimental design features. For example, vectors used to
generate the Figure 5 data were low copy number (centromere containing) vectors, whereas
vectors used to generate the Figure 1A data were multicopy (2 micron) vectors.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Imperial Cancer Research Technology Limited

(B) STREET: Sardinia House, Sardinia Street,

(C) CITY: London

(D) STATE: England

(E) COUNTRY: Great Britain

(F) POSTAL CODE (ZIP): WC2A 3NL

(G) TELEPHONE: 0171 242 1136

(H) TELEFAX: 0171 831 4991

(ii) TITLE OF INVENTION: Chemical Compounds

(iii) NUMBER OF SEQUENCES: 79

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: GB 9708676.3

(B) FILING DATE: 30-APR-1997

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Leu Xaa Xaa Leu Leu
1               5

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Pro Gln Ala Gln Gln Lys Ser Leu Leu Gln Gln Leu Leu Thr
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Lys Leu Val Gln Leu Leu Thr Thr Thr
1               5

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Ile Leu His Arg Leu Leu Gln Glu
1               5

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Leu Leu Gln Gln Leu Leu Thr Glu
1               5

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Tyr Leu Glu Gly Leu Leu Met His Gln Ala
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Leu Leu Ala Ser Leu Leu Gln Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

His Leu Lys Thr Leu Leu Lys Lys Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Gln Leu Ala Leu Leu Leu Ser Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu Leu Leu His Leu Leu Lys Ser Gln
1               5

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Val Thr Leu Leu Gln Leu Leu Leu Gly
1               5

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Val Leu Gln Leu Leu Leu Gly Asn
1               5

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Leu Leu Ser Arg Leu Leu Arg Gln
1               5

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Val Leu Lys Gln Leu Leu Leu Ser Glu Asn
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Val Leu Lys Gln Leu Leu Leu Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Glu Leu Ala Glu Leu Leu Ser Ala Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ser Leu Gly Pro Leu Leu Leu Glu
1               5

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Lys Leu Val Gln Leu Leu Thr Thr Thr
1               5

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ile Leu His Arg Leu Leu Gln Glu
1               5

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Leu Leu Arg Tyr Leu Leu Asp Lys
1               5

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gln Leu Asp Glu Leu Leu Cys Pro Pro
1               5

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Leu Leu Gln Gln Leu Leu Thr Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Gln Leu Ser Glu Leu Leu Arg Gly
1 5 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Val Leu Lys Gln Leu Leu Leu Ser Glu Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Ala Leu Lys Gln Leu Leu Leu Ser Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Val Ala Lys Gln Leu Leu Leu Ser Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Val Val Lys Gln Leu Leu Leu Ser Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Val Leu Lys Gln Ala Leu Leu Ser Glu
1 5

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Val Leu Lys Gln Val Leu Leu Ser Glu
1 5

(2) INFORMATION FOR SEQ ID NO: 30:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Val Leu Lys Gln Leu Ala Leu Ser Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 31:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Val Leu Lys Gln Leu Val Leu Ser Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Val Leu Lys Gln Leu Leu Ala Ser Glu
1 5

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Tyr Leu Glu Gly Leu Leu Met His Gln Ala Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Leu Leu Ala Ser Leu Leu Gln Ser Glu Ser Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

His Leu Lys Thr Leu Leu Lys Lys Ser Lys Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Gln Leu Ala Leu Leu Leu Ser Ser Glu Ala His
1 5 10

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Leu Leu Leu His Leu Leu Lys Ser Gln Thr Ile
1 5 10

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Leu Leu Gln Leu Leu Leu Gly His Lys Asn Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Val Leu Gln Leu Leu Leu Gly Asn Pro Lys Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Leu Leu Ser Arg Leu Leu Arg Gln Asn Gln Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Val Leu Lys Gln Leu Leu Leu Ser Glu Asn Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Lys Leu Val Gln Leu Leu Thr Thr Thr Ala Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Ile Leu His Arg Leu Leu Gln Glu Gly Ser Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Leu Leu Arg Tyr Leu Leu Asp Lys Asp Glu Lys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Leu Leu Gln Gln Leu Leu Thr Glu
1 5

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Lys Leu Leu Gln Leu Leu Thr Thr Lys Ser Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Ile Leu His Arg Leu Leu Gln Asp Ser Ser Ser

1 5 10 



(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Leu Leu Arg Tyr Leu Leu Asp Lys Asp Asp Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 49:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

Gln Leu Ser Glu Leu Leu Arg Gly Gly Ser Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 50:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Gln Leu Val Leu Leu Leu His Ala His Lys Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Gln Leu Ser Glu Leu Leu Arg Gly Ser Ser Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Gln Leu Val Leu Leu Leu His Ala His Lys Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Ile Leu Thr Ser Leu Leu Leu Asn Ser Ser Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Met Leu Met Asn Leu Leu Lys Asp Asn Pro Ala 

1 5 10

(2) INFORMATION FOR SEQ ID NO: 55: 




WO 98/49561



PCT/GB98/01238 



- 40 - 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Thr Leu Arg Ser Leu Leu Leu Asn Pro His Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Arg Leu Ala Val Leu Leu Pro Gly Arg His Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 57:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Glu Leu His Asn Leu Leu Glu Val Val Ser Gln
1 5 10



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Thr Leu Arg Asp Leu Leu Thr Thr Thr Ala Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 59:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Phe Leu Asp Phe Leu Leu Gly Phe Ser Ala Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Val Leu Glu Leu Leu Leu Arg Ala Gly Ala Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 61:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ile Leu Ala Arg Leu Leu Arg Ala His Gly Ala
1 5 10

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 



Ala Leu Gln Asp Leu Leu Arg Thr Leu Lys Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 



Ala Leu Gln Asn Leu Leu Arg Thr Leu Arg Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 64:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Lys Leu Leu Gln Leu Leu Thr Cys Ser Ser Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 65:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Ile Leu His Lys Leu Leu Gln Asn Gly Asn Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Leu Leu Arg Tyr Leu Leu Asp Arg Asp Asp Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 67:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Gln Leu Tyr Ser Leu Leu Gly Gln Phe Asn Cys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 68:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Glu Leu Glu Asn Leu Leu Gln Gln Gly Gly
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Val Leu Gln Lys Leu Leu Lys Glu Lys Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Glu Leu Asn Gln Leu Leu Asn Ala Val Lys
1 5 10

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Val Leu Lys Asp Leu Leu Lys Gln
1 5

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Pro Gln Ala Gln Gln Lys Ser Leu Leu Gln Gln Ala Ala Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Ser Gln Thr Ser His Lys Leu Val Gln Leu Leu Thr Thr Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Thr Ala Arg His Lys Ile Leu His Arg Leu Leu Gln Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Ser Lys Asp His Gln Leu Leu Arg Tyr Leu Leu Asp Lys

1 5 10 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Gln Ala Gln Gln Lys Ser Leu Leu Gln Gln Leu Leu Thr Glu
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

GAATTCCTGC AGCCCGGGGT CGACACTAGT TAACTAGCGG CCGC 44

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

AAGCTTCCAC CATGGTGCCA AAGAAGAAAC GTAAAGTT 38 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

TCTAGACTGC AGACTAGTAG ATCTCCCGGG GCGGCCGC 38 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Glu Phe Leu Gln Pro Gly Val Asp Thr Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Met Val Pro Lys Lys Lys Arg Lys Val 
1 5 
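
As a reading aid only, the three-letter residue codes used throughout this listing can be converted to the one-letter codes used in the claims. The sketch below is illustrative and not part of the specification; the helper name to_one_letter is hypothetical, and the mapping is the standard amino acid code table.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter codes.

    Hypothetical helper: raises KeyError on any token that is not one of
    the twenty standard residues, which also flags OCR-damaged codes.
    """
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


# SEQ ID NO: 44 and SEQ ID NO: 81 as given in the listing.
print(to_one_letter("Leu Leu Arg Tyr Leu Leu Asp Lys Asp Glu Lys"))  # LLRYLLDKDEK
print(to_one_letter("Met Val Pro Lys Lys Lys Arg Lys Val"))          # MVPKKKRKV
```

Because unknown tokens raise an error rather than being skipped, the same helper can serve as a quick check that a transcribed entry contains only valid residue codes.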
CLAIMS

1. A method for identifying inhibitor compounds capable of reducing the interaction
between:

a) a first region which is a signature motif on a nuclear protein, and
b) a second region which is that part of a nuclear receptor which is capable of interacting
with the nuclear protein through binding to the signature motif,
wherein:

the nuclear protein is a bridging factor that is responsible for the interaction between a
liganded nuclear receptor and a transcription initiation complex involved in regulation of gene
expression;

the nuclear receptor is a transcription factor;

the signature motif is a short sequence of amino acid residues which is the key structural
element of a nuclear protein which binds to a liganded nuclear receptor as part of the process
of the activation or repression of target genes; and
in which the method comprises taking:

i) the potential inhibitor compound;

ii) the liganded nuclear receptor or a fragment thereof in which the fragment comprises
the second region defined in this claim in b) above;

iii) a fragment comprising a signature motif of the nuclear protein; and

iv) detecting the presence or absence of inhibition of the interaction between ii) and iii).

2. A method according to claim 1 in which the signature motif is B1XXLL in which B1 is
any natural hydrophobic amino acid, L is leucine and X independently represents any natural
amino acid.

3. A method according to claim 2 in which B1 is leucine or valine.

4. A method according to claim 3 in which B1 is leucine.

5. A method according to any one of claims 2-4 in which the signature motif is further
defined as B2B1XXLL wherein B2 is a hydrophobic amino acid.

6. A method according to claim 5 in which B2 is selected from the group consisting of
isoleucine, leucine, methionine, phenylalanine, tryptophan, tyrosine and valine.

7. A method according to any one of claims 1-6 in which the nuclear protein is a
coactivator.

8. A method according to claim 7 in which the coactivator is selected from the group
consisting of RIP 140, SRC-1, TIF2, CBP, p300, TIF1, Trip1, Trip2, Trip3, Trip4, Trip5,
Trip8, Trip9, p/CIP, ARA70 & Trip230.

9. A method according to any one of claims 1-6 in which the transcription factor is a 
steroid hormone receptor. 

10. A method according to claim 9 in which the steroid hormone receptor is selected from
the group consisting of oestrogen receptor, progesterone receptor, androgen receptor and
glucocorticoid receptor.

11. A method according to claim 10 in which the steroid hormone receptor is oestrogen
receptor.

12. A method according to any preceding claim wherein the method is in the form of a
2-hybrid assay system.

13. A method according to any preceding claim wherein the potential inhibitor is in the
form of a peptide library based on a signature motif as defined in any one of claims 2-6.

14. A novel inhibitor identified according to the method defined in any one of claims 1-13
which reduces the interaction between

a) a first region which is a signature motif on a nuclear protein, and
b) a second region which is that part of a nuclear receptor which is capable of interacting
with the nuclear protein through binding to the signature motif,
wherein:

the nuclear protein is a bridging factor that is responsible for the interaction between a
liganded nuclear receptor and the transcription initiation complex involved in regulation of
gene expression;

the nuclear receptor is a transcription factor;

the signature motif is a short sequence of amino acid residues which is the key structural
element of a nuclear protein which binds to a liganded nuclear receptor as part of the process
of the activation or repression of target genes.

15. An inhibitor according to claim 14 which is a peptide of less than 15 amino acid
residues comprising the signature motif defined in any one of claims 1-6.

16. An inhibitor according to claim 15 selected from the group consisting of
PQAQQKSLLQQLLT (SEQ ID NO: 2), KLVQLLTTT (SEQ ID NO: 3), ILHRLLQE (SEQ
ID NO: 4) and LLQQLLTE (SEQ ID NO: 5).

17. An inhibitor according to claim 14 comprising an antibody which specifically binds
to a signature motif on a nuclear protein.

18. A pharmaceutical composition which comprises an inhibitor as defined in any one of
claims 14-17 or a pharmaceutically-acceptable salt thereof, in association with a
pharmaceutically-acceptable diluent or carrier.

19. A method of mapping nuclear receptor interaction domains in nuclear proteins in
which the method comprises analysis of the sequence of a nuclear protein for the presence of
signature motifs as defined in any one of claims 1-6 in order to identify an interaction domain
or a potential interaction domain.
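
The sequence analysis recited in claim 19, using the motif definitions of claims 2 to 6, can be sketched as a simple pattern scan over a one-letter protein sequence. This is an illustrative sketch under stated assumptions, not the applicants' procedure: the function name find_signature_motifs is hypothetical, and the hydrophobic residue set is assumed to be the one listed in claim 6, since claim 2 does not enumerate the residues permitted at B1.

```python
import re

# Assumption: "hydrophobic" is taken as the residue set listed in claim 6
# (Ile, Leu, Met, Phe, Trp, Tyr, Val); claim 2 names no explicit set for B1.
HYDROPHOBIC = "ILMFWYV"

# B1-X-X-L-L (claim 2). A lookahead is used so overlapping motifs are found.
CORE_MOTIF = re.compile(rf"(?=([{HYDROPHOBIC}]..LL))")


def find_signature_motifs(protein_seq: str):
    """Return (1-based position, motif, has_B2) for every B1XXLL match.

    has_B2 is True when the residue immediately before B1 is also
    hydrophobic, i.e. the extended form B2-B1-X-X-L-L of claim 5.
    """
    seq = protein_seq.upper()
    hits = []
    for match in CORE_MOTIF.finditer(seq):
        start = match.start()
        has_b2 = start > 0 and seq[start - 1] in HYDROPHOBIC
        hits.append((start + 1, match.group(1), has_b2))
    return hits


# Peptide of SEQ ID NO: 2 as recited in claim 16.
print(find_signature_motifs("PQAQQKSLLQQLLT"))
```

A hit marks only a candidate interaction domain; as claim 19 itself puts it, the result is an interaction domain or a potential interaction domain, and binding would still need to be confirmed experimentally.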
Fig. 2A.
Fig. 2C.
Fig. 3A.
Fig. 4.